Sinkhorn barycenter
Sinkhorn Barycenter via Functional Gradient Descent
In this paper, we consider the problem of computing the barycenter of a set of probability distributions under the Sinkhorn divergence. This problem has recently found applications across various domains, including graphics, learning, and vision, as it provides a meaningful mechanism to aggregate knowledge. Unlike previous approaches which directly operate in the space of probability measures, we recast the Sinkhorn barycenter problem as an instance of unconstrained functional optimization and develop a novel functional gradient descent method named \texttt{Sinkhorn Descent} (\texttt{SD}). We prove that \texttt{SD} converges to a stationary point at a sublinear rate, and under reasonable assumptions, we further show that it asymptotically finds a global minimizer of the Sinkhorn barycenter problem. Moreover, by providing a mean-field analysis, we show that \texttt{SD} preserves the weak convergence of empirical measures. Importantly, the computational complexity of \texttt{SD} scales linearly in the dimension $d$ and we demonstrate its scalability by solving a $100$-dimensional Sinkhorn barycenter problem.
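To make the particle-descent idea concrete, here is a minimal NumPy sketch (not the paper's implementation): barycenter particles are moved toward the barycentric projections given by entropic OT plans against each input measure, using a log-domain Sinkhorn solver for stability. The learning rate, regularization `eps`, and the omission of the debiasing terms of the full Sinkhorn divergence are simplifying assumptions.

```python
import numpy as np
from scipy.special import logsumexp

def sinkhorn_plan(a, b, C, eps=0.5, n_iter=200):
    """Entropic OT plan between discrete measures (weights a, b; cost C),
    computed with log-domain Sinkhorn iterations for numerical stability."""
    f, g = np.zeros(len(a)), np.zeros(len(b))
    loga, logb = np.log(a), np.log(b)
    for _ in range(n_iter):
        f = -eps * logsumexp((g[None, :] - C) / eps + logb[None, :], axis=1)
        g = -eps * logsumexp((f[:, None] - C) / eps + loga[:, None], axis=0)
    return np.exp((f[:, None] + g[None, :] - C) / eps
                  + loga[:, None] + logb[None, :])

def descent_step(x, targets, eps=0.5, lr=0.1):
    """One particle-descent step: move barycenter particles x toward the
    barycentric projections under the entropic OT plans, averaged over the
    input measures. Debiasing terms of the Sinkhorn divergence are omitted
    to keep the sketch short."""
    a = np.full(len(x), 1.0 / len(x))
    grad = np.zeros_like(x)
    for y in targets:
        b = np.full(len(y), 1.0 / len(y))
        C = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)  # squared cost
        pi = sinkhorn_plan(a, b, C, eps)
        T = (pi @ y) / pi.sum(1)[:, None]   # barycentric projection
        grad += 2.0 * (x - T)               # envelope-theorem gradient
    return x - lr * grad / len(targets)

# Toy run: barycenter of two 2-D Gaussian clouds centered at -2 and +2;
# by symmetry the particles should concentrate near the origin.
rng = np.random.default_rng(0)
targets = [rng.normal(-2.0, 0.3, (40, 2)), rng.normal(2.0, 0.3, (40, 2))]
x = rng.normal(0.0, 1.0, (30, 2))
for _ in range(60):
    x = descent_step(x, targets)
```

Each step costs one Sinkhorn solve per input measure, and nothing in the update depends on a fixed grid, which is what lets this style of method scale with the ambient dimension.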
Review for NeurIPS paper: Sinkhorn Barycenter via Functional Gradient Descent
Weaknesses: The constants in the bounds depend linearly on the dimension, although they depend exponentially on the regularization parameter. If the Sinkhorn distance is thought of as a proxy for the Wasserstein distance, this seems to be a hidden dependence on the dimension, since the regularization parameter plays the role of an interpolation between the MMD and Wasserstein distances, and MMD distances are more blind to the dimension. This is not discussed in the paper. The results also have an exponential dependence on an assumed uniform upper bound on the cost. For the classical quadratic cost, this implies an exponential dependence on the dimension for the case of measures supported on $[0,1]^d$, for instance.
Review for NeurIPS paper: Sinkhorn Barycenter via Functional Gradient Descent
This paper proposes a new method to compute the Sinkhorn barycenter of several probability measures. In practice, the method scales well computationally, even in high dimensions, and the authors provide some theoretical support. Reviewers agree that this paper is strong, with only minor weaknesses (such as the exponential dependence on $1/\gamma$, which is often suboptimal in related work as well). I thus recommend accept (poster).
Sinkhorn Barycenters with Free Support via Frank-Wolfe Algorithm
Luise, Giulia, Salzo, Saverio, Pontil, Massimiliano, Ciliberto, Carlo
We present a novel algorithm to estimate the barycenter of arbitrary probability distributions with respect to the Sinkhorn divergence. Based on a Frank-Wolfe optimization strategy, our approach proceeds by populating the support of the barycenter incrementally, without requiring any pre-allocation. We consider discrete as well as continuous distributions, proving convergence rates of the proposed algorithm in both settings. Key elements of our analysis are a new result showing that the Sinkhorn divergence on compact domains has Lipschitz continuous gradient with respect to the Total Variation and a characterization of the sample complexity of Sinkhorn potentials. Experiments validate the effectiveness of our method in practice.
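The incremental free-support idea can be sketched as follows (a simplified illustration, not the paper's exact algorithm): each Frank-Wolfe step evaluates the Sinkhorn dual potentials of the current barycenter against every input measure, picks the candidate point with the smallest summed potential as the linear-minimization oracle, and adds it as a new atom with the standard $2/(k+2)$ step size. Restricting candidates to a finite grid and using entropic-OT potentials without debiasing are simplifying assumptions.

```python
import numpy as np
from scipy.special import logsumexp

def sinkhorn_potentials(a, b, C, eps=0.5, n_iter=300):
    """Dual potentials (f, g) of entropic OT via log-domain Sinkhorn."""
    f, g = np.zeros(len(a)), np.zeros(len(b))
    loga, logb = np.log(a), np.log(b)
    for _ in range(n_iter):
        f = -eps * logsumexp((g[None, :] - C) / eps + logb[None, :], axis=1)
        g = -eps * logsumexp((f[:, None] - C) / eps + loga[:, None], axis=0)
    return f, g

def potential_at(z, y, g, b, eps=0.5):
    """Smooth extension of the potential f to arbitrary points z."""
    C = ((z[:, None, :] - y[None, :, :]) ** 2).sum(-1)
    return -eps * logsumexp((g[None, :] - C) / eps + np.log(b)[None, :], axis=1)

def frank_wolfe_barycenter(measures, grid, n_steps=30, eps=0.5):
    """Free-support barycenter sketch: grow the support one atom per
    Frank-Wolfe step. The linearization uses entropic-OT potentials only
    (debiasing omitted); candidates are restricted to a finite grid."""
    x, w = grid[[0]], np.ones(1)       # start from a single arbitrary atom
    for k in range(n_steps):
        score = np.zeros(len(grid))
        for (y, b) in measures:
            C = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)
            _, g = sinkhorn_potentials(w, b, C, eps)
            score += potential_at(grid, y, g, b, eps)
        z = grid[[np.argmin(score)]]   # linear-minimization oracle
        gamma = 2.0 / (k + 2.0)        # standard Frank-Wolfe step size
        x = np.vstack([x, z])
        w = np.concatenate([(1 - gamma) * w, [gamma]])
        keep = w > 1e-12               # drop atoms whose weight vanished
        x, w = x[keep], w[keep]
    return x, w

# Toy example: barycenter of two 1-D clusters near -1 and +1; the atoms
# should be placed around 0 with weights summing to 1.
measures = [(np.linspace(-1.3, -0.7, 20)[:, None], np.full(20, 1 / 20)),
            (np.linspace(0.7, 1.3, 20)[:, None], np.full(20, 1 / 20))]
grid = np.linspace(-2.0, 2.0, 41)[:, None]
bary_x, bary_w = frank_wolfe_barycenter(measures, grid)
```

Note how the support is never pre-allocated: the atom count grows with the iteration count, matching the abstract's description of populating the support incrementally.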